Dataset statistics
| Number of variables | 19 |
|---|---|
| Number of observations | 8578 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.2 MiB |
| Average record size in memory | 152.0 B |
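The memory figures above can be roughly reproduced from the row and column counts. The sketch below is a back-of-the-envelope check, not the profiler's own method: it assumes every column stores 8-byte values (int64 or float64) and that the default index adds a fixed 128 bytes. Neither assumption is stated in the report.

```python
# Counts taken from the "Dataset statistics" table.
N_ROWS = 8578
N_COLS = 19

# Assumptions (not stated in the report): 8-byte values per cell,
# and a fixed 128-byte overhead for the default integer index.
BYTES_PER_VALUE = 8
INDEX_BYTES = 128

column_bytes = N_ROWS * BYTES_PER_VALUE
total_bytes = N_COLS * column_bytes + INDEX_BYTES

total_mib = total_bytes / 1024 ** 2                  # total size in memory
avg_record_bytes = total_bytes / N_ROWS              # average record size
column_kib = (column_bytes + INDEX_BYTES) / 1024     # one column plus index

print(f"Total size: {total_mib:.1f} MiB")
print(f"Average record size: {avg_record_bytes:.1f} B")
print(f"Per-variable memory size: {column_kib:.1f} KiB")
```

Under these assumptions the rounded values agree with the 1.2 MiB, 152.0 B, and 67.1 KiB figures shown in this report.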
Variable types
| Numeric | 9 |
|---|---|
| Categorical | 10 |
Unnamed: 0 is highly correlated with ID | High correlation |
ID is highly correlated with Unnamed: 0 | High correlation |
NAME_FAMILY_STATUS is highly correlated with CNT_FAM_MEMBERS | High correlation |
CNT_FAM_MEMBERS is highly correlated with NAME_FAMILY_STATUS | High correlation |
CODE_GENDER is highly correlated with FLAG_OWN_CAR and 1 other fields | High correlation |
FLAG_OWN_CAR is highly correlated with CODE_GENDER | High correlation |
NAME_INCOME_TYPE is highly correlated with OCCUPATION_TYPE and 2 other fields | High correlation |
OCCUPATION_TYPE is highly correlated with CODE_GENDER and 1 other fields | High correlation |
AGE is highly correlated with NAME_INCOME_TYPE | High correlation |
YEARS_EMPLOYED is highly correlated with NAME_INCOME_TYPE | High correlation |
STATUS is uniformly distributed | Uniform |
Unnamed: 0 has unique values | Unique |
ID has unique values | Unique |
OCCUPATION_TYPE has 294 (3.4%) zeros | Zeros |
YEARS_EMPLOYED has 1351 (15.7%) zeros | Zeros |
Reproduction
| Analysis started | 2022-05-07 14:46:02.207852 |
|---|---|
| Analysis finished | 2022-05-07 14:46:28.147763 |
| Duration | 25.94 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
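The reported duration follows directly from the two timestamps. A minimal check using only the standard library (the format string is an assumption about how the timestamps are written):

```python
from datetime import datetime

# Timestamp layout assumed from the values shown in the table above.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-05-07 14:46:02.207852", FMT)
finished = datetime.strptime("2022-05-07 14:46:28.147763", FMT)

# Difference between the two timestamps, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```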
Unnamed: 0
Real number (ℝ≥0)
HIGH CORRELATION, UNIQUE
| Distinct | 8578 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 19018.26743 |
| Minimum | 0 |
|---|---|
| Maximum | 36452 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 67.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1905.7 |
| Q1 | 9663 |
| median | 19000.5 |
| Q3 | 28510.75 |
| 95-th percentile | 35775.6 |
| Maximum | 36452 |
| Range | 36452 |
| Interquartile range (IQR) | 18847.75 |
Descriptive statistics
| Standard deviation | 10830.64959 |
|---|---|
| Coefficient of variation (CV) | 0.5694866595 |
| Kurtosis | -1.207944468 |
| Mean | 19018.26743 |
| Median Absolute Deviation (MAD) | 9444 |
| Skewness | -0.04493941555 |
| Sum | 163138698 |
| Variance | 117302970.5 |
| Monotonicity | Strictly increasing |
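Several of the statistics above are derived from one another. The sketch below recomputes them from the values reported for `Unnamed: 0`, as a consistency check rather than a description of how the profiler computes them internally.

```python
# Values copied from the "Unnamed: 0" tables above.
n = 8578
mean = 19018.26743
std = 10830.64959
minimum, maximum = 0, 36452
q1, q3 = 9663, 28510.75

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
total = mean * n                 # sum of all values

print(f"CV       = {cv:.10f}")
print(f"Variance = {variance:.1f}")
print(f"Range    = {value_range}")
print(f"IQR      = {iqr}")
print(f"Sum      = {total:.0f}")
```

Small differences in the last digits are expected because the inputs are already rounded.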
| Value | Count | Frequency (%) |
| 0 | 1 | < 0.1% |
| 25476 | 1 | < 0.1% |
| 25533 | 1 | < 0.1% |
| 25532 | 1 | < 0.1% |
| 25531 | 1 | < 0.1% |
| 25515 | 1 | < 0.1% |
| 25514 | 1 | < 0.1% |
| 25513 | 1 | < 0.1% |
| 25512 | 1 | < 0.1% |
| 25509 | 1 | < 0.1% |
| Other values (8568) | 8568 |
| Value | Count | Frequency (%) |
| 0 | 1 | |
| 1 | 1 | |
| 16 | 1 | |
| 18 | 1 | |
| 19 | 1 | |
| 20 | 1 | |
| 21 | 1 | |
| 22 | 1 | |
| 28 | 1 | |
| 32 | 1 |
| Value | Count | Frequency (%) |
| 36452 | 1 | |
| 36451 | 1 | |
| 36450 | 1 | |
| 36449 | 1 | |
| 36448 | 1 | |
| 36447 | 1 | |
| 36446 | 1 | |
| 36445 | 1 | |
| 36444 | 1 | |
| 36443 | 1 |
ID
Real number (ℝ≥0)
| Distinct | 8578 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5079032.727 |
| Minimum | 5008804 |
|---|---|
| Maximum | 5150473 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 67.1 KiB |
Quantile statistics
| Minimum | 5008804 |
|---|---|
| 5-th percentile | 5020787.8 |
| Q1 | 5044488.75 |
| median | 5078897 |
| Q3 | 5115682.25 |
| 95-th percentile | 5146112.2 |
| Maximum | 5150473 |
| Range | 141669 |
| Interquartile range (IQR) | 71193.5 |
Descriptive statistics
| Standard deviation | 41866.87599 |
|---|---|
| Coefficient of variation (CV) | 0.008243080572 |
| Kurtosis | -1.211133521 |
| Mean | 5079032.727 |
| Median Absolute Deviation (MAD) | 36661.5 |
| Skewness | 0.05718306322 |
| Sum | 4.356794273 × 10¹⁰ |
| Variance | 1752835306 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5008804 | 1 | < 0.1% |
| 5105145 | 1 | < 0.1% |
| 5105222 | 1 | < 0.1% |
| 5105221 | 1 | < 0.1% |
| 5105220 | 1 | < 0.1% |
| 5105196 | 1 | < 0.1% |
| 5105195 | 1 | < 0.1% |
| 5105194 | 1 | < 0.1% |
| 5105193 | 1 | < 0.1% |
| 5105190 | 1 | < 0.1% |
| Other values (8568) | 8568 |
| Value | Count | Frequency (%) |
| 5008804 | 1 | |
| 5008805 | 1 | |
| 5008823 | 1 | |
| 5008825 | 1 | |
| 5008826 | 1 | |
| 5008827 | 1 | |
| 5008830 | 1 | |
| 5008831 | 1 | |
| 5008832 | 1 | |
| 5008839 | 1 |
| Value | Count | Frequency (%) |
| 5150473 | 1 | |
| 5150467 | 1 | |
| 5150466 | 1 | |
| 5150464 | 1 | |
| 5150463 | 1 | |
| 5150459 | 1 | |
| 5150423 | 1 | |
| 5150417 | 1 | |
| 5150414 | 1 | |
| 5150412 | 1 |
CODE_GENDER
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 67.1 KiB |
| 0 | |
|---|---|
| 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 5656 | 65.9% |
| 1 | 2922 | 34.1% |
FLAG_OWN_CAR
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 67.1 KiB |
| 0 | |
|---|---|
| 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 5406 | 63.0% |
| 1 | 3172 | 37.0% |
FLAG_OWN_REALTY
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 67.1 KiB |
| 1 | |
|---|---|
| 0 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 5600 | 65.3% |
| 0 | 2978 | 34.7% |
AMT_INCOME_TOTAL
Real number (ℝ≥0)
| Distinct | 184 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 190041.0654 |
| Minimum | 27000 |
|---|---|
| Maximum | 1575000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 67.1 KiB |
Quantile statistics
| Minimum | 27000 |
|---|---|
| 5-th percentile | 76500 |
| Q1 | 121500 |
| median | 162000 |
| Q3 | 225000 |
| 95-th percentile | 360000 |
| Maximum | 1575000 |
| Range | 1548000 |
| Interquartile range (IQR) | 103500 |
Descriptive statistics
| Standard deviation | 108333.0094 |
|---|---|
| Coefficient of variation (CV) | 0.5700505264 |
| Kurtosis | 22.28318222 |
| Mean | 190041.0654 |
| Median Absolute Deviation (MAD) | 49500 |
| Skewness | 3.225052512 |
| Sum | 1630172259 |
| Variance | 1.173604092 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 135000 | 997 | 11.6% |
| 225000 | 704 | 8.2% |
| 157500 | 704 | 8.2% |
| 180000 | 692 | 8.1% |
| 112500 | 675 | 7.9% |
| 202500 | 589 | 6.9% |
| 270000 | 400 | 4.7% |
| 90000 | 397 | 4.6% |
| 315000 | 239 | 2.8% |
| 247500 | 233 | 2.7% |
| Other values (174) | 2948 |
| Value | Count | Frequency (%) |
| 27000 | 3 | |
| 29250 | 1 | < 0.1% |
| 31500 | 4 | |
| 32400 | 2 | |
| 33300 | 4 | |
| 36000 | 1 | < 0.1% |
| 36900 | 3 | |
| 37800 | 2 | |
| 38250 | 1 | < 0.1% |
| 39600 | 2 |
| Value | Count | Frequency (%) |
| 1575000 | 2 | < 0.1% |
| 1350000 | 6 | |
| 1125000 | 1 | < 0.1% |
| 990000 | 2 | < 0.1% |
| 945000 | 1 | < 0.1% |
| 900000 | 14 | |
| 810000 | 6 | |
| 765000 | 2 | < 0.1% |
| 742500 | 1 | < 0.1% |
| 720000 | 4 | < 0.1% |
NAME_INCOME_TYPE
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 67.1 KiB |
| 4 | |
|---|---|
| 0 | |
| 1 | |
| 2 | |
| 3 | 3 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 4 |
|---|---|
| 2nd row | 4 |
| 3rd row | 0 |
| 4th row | 4 |
| 5th row | 4 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 4 | 4390 | 51.2% |
| 0 | 2091 | 24.4% |
| 1 | 1368 | 15.9% |
| 2 | 726 | 8.5% |
| 3 | 3 | < 0.1% |
NAME_EDUCATION_TYPE
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 67.1 KiB |
| 4 | |
|---|---|
| 1 | |
| 2 | 395 |
| 3 | 79 |
| 0 | 13 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 4 |
| 4th row | 2 |
| 5th row | 2 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 4 | 5754 | 67.1% |
| 1 | 2337 | 27.2% |
| 2 | 395 | 4.6% |
| 3 | 79 | 0.9% |
| 0 | 13 | 0.2% |
NAME_FAMILY_STATUS
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 67.1 KiB |
| 1 | |
|---|---|
| 3 | |
| 0 | |
| 2 | 486 |
| 4 | 337 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 5832 | 68.0% |
| 3 | 1199 | 14.0% |
| 0 | 724 | 8.4% |
| 2 | 486 | 5.7% |
| 4 | 337 | 3.9% |
NAME_HOUSING_TYPE
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.298671019 |
| Minimum | 0 |
|---|---|
| Maximum | 5 |
| Zeros | 41 |
| Zeros (%) | 0.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 67.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.972443261 |
|---|---|
| Coefficient of variation (CV) | 0.7487987696 |
| Kurtosis | 8.771831058 |
| Mean | 1.298671019 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.180293341 |
| Sum | 11140 |
| Variance | 0.945645896 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 7599 | |
| 5 | 435 | 5.1% |
| 2 | 289 | 3.4% |
| 4 | 146 | 1.7% |
| 3 | 68 | 0.8% |
| 0 | 41 | 0.5% |
| Value | Count | Frequency (%) |
| 0 | 41 | 0.5% |
| 1 | 7599 | |
| 2 | 289 | 3.4% |
| 3 | 68 | 0.8% |
| 4 | 146 | 1.7% |
| 5 | 435 | 5.1% |
| Value | Count | Frequency (%) |
| 5 | 435 | 5.1% |
| 4 | 146 | 1.7% |
| 3 | 68 | 0.8% |
| 2 | 289 | 3.4% |
| 1 | 7599 | |
| 0 | 41 | 0.5% |
FLAG_WORK_PHONE
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 67.1 KiB |
| 0 | |
|---|---|
| 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 6664 | 77.7% |
| 1 | 1914 | 22.3% |
FLAG_PHONE
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 67.1 KiB |
| 0 | |
|---|---|
| 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 6077 | 70.8% |
| 1 | 2501 | 29.2% |
FLAG_EMAIL
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 67.1 KiB |
| 0 | |
|---|---|
| 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 7732 | 90.1% |
| 1 | 846 | 9.9% |
OCCUPATION_TYPE
Real number (ℝ≥0)
| Distinct | 19 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.131732339 |
| Minimum | 0 |
|---|---|
| Maximum | 18 |
| Zeros | 294 |
| Zeros (%) | 3.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 67.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 6 |
| median | 10 |
| Q3 | 12 |
| 95-th percentile | 15 |
| Maximum | 18 |
| Range | 18 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 4.316702589 |
|---|---|
| Coefficient of variation (CV) | 0.4727145331 |
| Kurtosis | -0.726897615 |
| Mean | 9.131732339 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.4047553466 |
| Sum | 78332 |
| Variance | 18.63392124 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 12 | 2546 | |
| 8 | 1452 | |
| 3 | 863 | 10.1% |
| 15 | 831 | 9.7% |
| 10 | 754 | 8.8% |
| 4 | 498 | 5.8% |
| 6 | 339 | 4.0% |
| 11 | 303 | 3.5% |
| 0 | 294 | 3.4% |
| 17 | 162 | 1.9% |
| Other values (9) | 536 | 6.2% |
| Value | Count | Frequency (%) |
| 0 | 294 | 3.4% |
| 1 | 131 | 1.5% |
| 2 | 160 | 1.9% |
| 3 | 863 | |
| 4 | 498 | 5.8% |
| 5 | 23 | 0.3% |
| 6 | 339 | 4.0% |
| 7 | 19 | 0.2% |
| 8 | 1452 | |
| 9 | 49 | 0.6% |
| Value | Count | Frequency (%) |
| 18 | 41 | 0.5% |
| 17 | 162 | 1.9% |
| 16 | 31 | 0.4% |
| 15 | 831 | 9.7% |
| 14 | 17 | 0.2% |
| 13 | 65 | 0.8% |
| 12 | 2546 | |
| 11 | 303 | 3.5% |
| 10 | 754 | 8.8% |
| 9 | 49 | 0.6% |
CNT_FAM_MEMBERS
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 8 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.193984612 |
| Minimum | 1 |
|---|---|
| Maximum | 9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 67.1 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 2 |
| Q3 | 3 |
| 95-th percentile | 4 |
| Maximum | 9 |
| Range | 8 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.9093973655 |
|---|---|
| Coefficient of variation (CV) | 0.4144957812 |
| Kurtosis | 1.518107344 |
| Mean | 2.193984612 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0.9470341408 |
| Sum | 18820 |
| Variance | 0.8270035684 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2 | 4566 | |
| 1 | 1681 | 19.6% |
| 3 | 1458 | 17.0% |
| 4 | 757 | 8.8% |
| 5 | 100 | 1.2% |
| 6 | 11 | 0.1% |
| 7 | 3 | < 0.1% |
| 9 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 1681 | 19.6% |
| 2 | 4566 | |
| 3 | 1458 | 17.0% |
| 4 | 757 | 8.8% |
| 5 | 100 | 1.2% |
| 6 | 11 | 0.1% |
| 7 | 3 | < 0.1% |
| 9 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 9 | 2 | < 0.1% |
| 7 | 3 | < 0.1% |
| 6 | 11 | 0.1% |
| 5 | 100 | 1.2% |
| 4 | 757 | 8.8% |
| 3 | 1458 | 17.0% |
| 2 | 4566 | |
| 1 | 1681 | 19.6% |
AGE
Real number (ℝ≥0)
| Distinct | 4128 |
|---|---|
| Distinct (%) | 48.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 43.23082104 |
| Minimum | 21.09557349 |
|---|---|
| Maximum | 68.86383704 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 67.1 KiB |
Quantile statistics
| Minimum | 21.09557349 |
|---|---|
| 5-th percentile | 26.96564611 |
| Q1 | 33.43463589 |
| median | 41.8872393 |
| Q3 | 52.56781453 |
| 95-th percentile | 62.87329651 |
| Maximum | 68.86383704 |
| Range | 47.76826355 |
| Interquartile range (IQR) | 19.13317864 |
Descriptive statistics
| Standard deviation | 11.55732585 |
|---|---|
| Coefficient of variation (CV) | 0.2673399573 |
| Kurtosis | -1.041837696 |
| Mean | 43.23082104 |
| Median Absolute Deviation (MAD) | 9.384176266 |
| Skewness | 0.2444028127 |
| Sum | 370833.9829 |
| Variance | 133.5717808 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 34.70570922 | 16 | 0.2% |
| 42.48957884 | 15 | 0.2% |
| 46.25967679 | 13 | 0.2% |
| 60.14360322 | 13 | 0.2% |
| 30.96299034 | 12 | 0.1% |
| 30.68098592 | 12 | 0.1% |
| 37.4709953 | 12 | 0.1% |
| 36.81389762 | 11 | 0.1% |
| 40.21164021 | 11 | 0.1% |
| 62.27917069 | 11 | 0.1% |
| Other values (4118) | 8452 |
| Value | Count | Frequency (%) |
| 21.09557349 | 1 | |
| 21.23794465 | 1 | |
| 21.79100187 | 1 | |
| 22.05110303 | 1 | |
| 22.05657885 | 2 | |
| 22.27013559 | 1 | |
| 22.30025257 | 1 | |
| 22.3112042 | 1 | |
| 22.33310746 | 1 | |
| 22.36322443 | 1 |
| Value | Count | Frequency (%) |
| 68.86383704 | 1 | |
| 68.83098216 | 1 | |
| 68.71872797 | 1 | |
| 68.01782377 | 1 | |
| 67.95485191 | 1 | |
| 67.9493761 | 1 | |
| 67.91378331 | 1 | |
| 67.85628726 | 1 | |
| 67.84533563 | 1 | |
| 67.78236377 | 1 |
YEARS_EMPLOYED
Real number (ℝ≥0)
| Distinct | 2420 |
|---|---|
| Distinct (%) | 28.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.956858019 |
| Minimum | 0 |
|---|---|
| Maximum | 42.87836164 |
| Zeros | 1351 |
| Zeros (%) | 15.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 67.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1.137600361 |
| median | 4.178046093 |
| Q3 | 8.512152885 |
| 95-th percentile | 19.29676859 |
| Maximum | 42.87836164 |
| Range | 42.87836164 |
| Interquartile range (IQR) | 7.374552523 |
Descriptive statistics
| Standard deviation | 6.437979503 |
|---|---|
| Coefficient of variation (CV) | 1.08076766 |
| Kurtosis | 4.18633369 |
| Mean | 5.956858019 |
| Median Absolute Deviation (MAD) | 3.455238643 |
| Skewness | 1.819572645 |
| Sum | 51097.92809 |
| Variance | 41.44758008 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1351 | 15.7% |
| 1.09790071 | 24 | 0.3% |
| 0.5475814014 | 20 | 0.2% |
| 4.213638884 | 19 | 0.2% |
| 1.927486533 | 18 | 0.2% |
| 5.212974941 | 17 | 0.2% |
| 3.438811201 | 17 | 0.2% |
| 7.515554734 | 17 | 0.2% |
| 4.632538656 | 16 | 0.2% |
| 8.854391261 | 15 | 0.2% |
| Other values (2410) | 7064 |
| Value | Count | Frequency (%) |
| 0 | 1351 | |
| 0.04654441912 | 2 | < 0.1% |
| 0.1779639555 | 1 | < 0.1% |
| 0.1916534905 | 1 | < 0.1% |
| 0.1943913975 | 1 | < 0.1% |
| 0.1998672115 | 2 | < 0.1% |
| 0.2436737236 | 2 | < 0.1% |
| 0.2491495376 | 2 | < 0.1% |
| 0.2518874446 | 1 | < 0.1% |
| 0.2546253516 | 5 | 0.1% |
| Value | Count | Frequency (%) |
| 42.87836164 | 1 | < 0.1% |
| 41.17264557 | 3 | |
| 40.75922161 | 1 | < 0.1% |
| 40.54840277 | 1 | < 0.1% |
| 40.45257603 | 2 | |
| 39.79821625 | 1 | < 0.1% |
| 39.62572811 | 3 | |
| 38.37998042 | 2 | |
| 36.7290225 | 1 | < 0.1% |
| 36.63867157 | 1 | < 0.1% |
STATUS
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 67.1 KiB |
| 1 | |
|---|---|
| 0 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 4289 | 50.0% |
| 0 | 4289 | 50.0% |
MONTHS_BALANCE
Real number (ℝ≥0)
| Distinct | 61 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 27.50373047 |
| Minimum | 0 |
|---|---|
| Maximum | 60 |
| Zeros | 46 |
| Zeros (%) | 0.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 67.1 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 14 |
| median | 26 |
| Q3 | 40 |
| 95-th percentile | 55 |
| Maximum | 60 |
| Range | 60 |
| Interquartile range (IQR) | 26 |
Descriptive statistics
| Standard deviation | 16.15204219 |
|---|---|
| Coefficient of variation (CV) | 0.5872673238 |
| Kurtosis | -1.043031607 |
| Mean | 27.50373047 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | 0.2170895007 |
| Sum | 235927 |
| Variance | 260.8884669 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 10 | 209 | 2.4% |
| 11 | 196 | 2.3% |
| 17 | 193 | 2.2% |
| 16 | 190 | 2.2% |
| 14 | 190 | 2.2% |
| 39 | 189 | 2.2% |
| 23 | 187 | 2.2% |
| 22 | 183 | 2.1% |
| 25 | 182 | 2.1% |
| 7 | 179 | 2.1% |
| Other values (51) | 6680 |
| Value | Count | Frequency (%) |
| 0 | 46 | 0.5% |
| 1 | 90 | |
| 2 | 107 | |
| 3 | 131 | |
| 4 | 134 | |
| 5 | 166 | |
| 6 | 167 | |
| 7 | 179 | |
| 8 | 155 | |
| 9 | 176 |
| Value | Count | Frequency (%) |
| 60 | 86 | |
| 59 | 74 | |
| 58 | 67 | |
| 57 | 74 | |
| 56 | 93 | |
| 55 | 98 | |
| 54 | 85 | |
| 53 | 106 | |
| 52 | 115 | |
| 51 | 112 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
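The definition above (Pearson's r applied to ranks) can be sketched in plain Python. This is an illustrative implementation, not the code the report generator uses; the helper names `average_ranks`, `pearson`, and `spearman` are hypothetical, and ties are assumed to receive the mean of their rank positions.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # nonlinear, but strictly increasing
print("Spearman:", spearman(x, y))
print("Pearson: ", pearson(x, y))
```

For this strictly increasing but nonlinear relationship, ρ comes out at 1 (up to floating-point rounding) while r stays below 1.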
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
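A short sketch of this calculation, including the invariance under changes of location and scale. The function name `pearson` and the sample data are illustrative assumptions, not values from the dataset.

```python
from statistics import mean

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/n factors of covariance and standard deviations cancel out.
    return cov / (sx * sy)

# Made-up sample data for illustration only.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 3.5, 9.0]

r = pearson(x, y)

# Shift and rescale each variable separately (positive scale factors).
r_transformed = pearson([3 * a + 10 for a in x], [0.5 * b - 2 for b in y])

print(r, r_transformed)
```

Both calls return the same value up to floating-point rounding, since only positive rescaling and shifting were applied.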
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
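The pair-counting definition above corresponds to the variant usually called τ-a. A minimal sketch follows; the function name `kendall_tau_a` is hypothetical, and library implementations may use a tie-adjusted variant, so results can differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Made-up data: 5 concordant pairs and 1 discordant pair out of 6.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```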
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
First rows
| | Unnamed: 0 | ID | CODE_GENDER | FLAG_OWN_CAR | FLAG_OWN_REALTY | AMT_INCOME_TOTAL | NAME_INCOME_TYPE | NAME_EDUCATION_TYPE | NAME_FAMILY_STATUS | NAME_HOUSING_TYPE | FLAG_WORK_PHONE | FLAG_PHONE | FLAG_EMAIL | OCCUPATION_TYPE | CNT_FAM_MEMBERS | AGE | YEARS_EMPLOYED | STATUS | MONTHS_BALANCE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 5008804 | 1 | 1 | 1 | 427500.0 | 4 | 1 | 0 | 4 | 1 | 0 | 0 | 12 | 2 | 32.868574 | 12.435574 | 1 | 15 |
| 1 | 1 | 5008805 | 1 | 1 | 1 | 427500.0 | 4 | 1 | 0 | 4 | 1 | 0 | 0 | 12 | 2 | 32.868574 | 12.435574 | 1 | 14 |
| 2 | 16 | 5008823 | 1 | 1 | 1 | 135000.0 | 0 | 4 | 1 | 1 | 0 | 0 | 0 | 8 | 2 | 48.674511 | 3.269061 | 0 | 7 |
| 3 | 18 | 5008825 | 0 | 1 | 0 | 130500.0 | 4 | 2 | 1 | 1 | 0 | 0 | 0 | 0 | 2 | 29.210730 | 3.019911 | 1 | 25 |
| 4 | 19 | 5008826 | 0 | 1 | 0 | 130500.0 | 4 | 2 | 1 | 1 | 0 | 0 | 0 | 0 | 2 | 29.210730 | 3.019911 | 1 | 30 |
| 5 | 20 | 5008830 | 0 | 0 | 1 | 157500.0 | 4 | 4 | 1 | 1 | 0 | 1 | 0 | 8 | 2 | 27.463945 | 4.021985 | 1 | 31 |
| 6 | 21 | 5008831 | 0 | 0 | 1 | 157500.0 | 4 | 4 | 1 | 1 | 0 | 1 | 0 | 8 | 2 | 27.463945 | 4.021985 | 1 | 19 |
| 7 | 22 | 5008832 | 0 | 0 | 1 | 157500.0 | 4 | 4 | 1 | 1 | 0 | 1 | 0 | 8 | 2 | 27.463945 | 4.021985 | 1 | 34 |
| 8 | 28 | 5008839 | 1 | 0 | 1 | 405000.0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 10 | 3 | 32.422295 | 5.519621 | 0 | 13 |
| 9 | 32 | 5008843 | 1 | 0 | 1 | 405000.0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 10 | 3 | 32.422295 | 5.519621 | 0 | 29 |
Last rows
| | Unnamed: 0 | ID | CODE_GENDER | FLAG_OWN_CAR | FLAG_OWN_REALTY | AMT_INCOME_TOTAL | NAME_INCOME_TYPE | NAME_EDUCATION_TYPE | NAME_FAMILY_STATUS | NAME_HOUSING_TYPE | FLAG_WORK_PHONE | FLAG_PHONE | FLAG_EMAIL | OCCUPATION_TYPE | CNT_FAM_MEMBERS | AGE | YEARS_EMPLOYED | STATUS | MONTHS_BALANCE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8568 | 36443 | 5149145 | 1 | 1 | 1 | 247500.0 | 4 | 4 | 1 | 1 | 1 | 0 | 0 | 8 | 2 | 29.985558 | 9.793493 | 1 | 25 |
| 8569 | 36444 | 5149158 | 1 | 1 | 1 | 247500.0 | 4 | 4 | 1 | 1 | 1 | 0 | 0 | 8 | 2 | 29.985558 | 9.793493 | 1 | 28 |
| 8570 | 36445 | 5149190 | 1 | 1 | 0 | 450000.0 | 4 | 1 | 1 | 1 | 0 | 1 | 1 | 3 | 3 | 26.960170 | 1.374429 | 1 | 11 |
| 8571 | 36446 | 5149729 | 1 | 1 | 1 | 90000.0 | 4 | 4 | 1 | 1 | 0 | 0 | 0 | 12 | 2 | 52.296762 | 4.711938 | 1 | 21 |
| 8572 | 36447 | 5149775 | 0 | 1 | 1 | 130500.0 | 4 | 4 | 1 | 1 | 0 | 1 | 0 | 8 | 2 | 44.181605 | 25.711685 | 1 | 19 |
| 8573 | 36448 | 5149828 | 1 | 1 | 1 | 315000.0 | 4 | 4 | 1 | 1 | 0 | 0 | 0 | 10 | 2 | 47.497211 | 6.625735 | 1 | 11 |
| 8574 | 36449 | 5149834 | 0 | 0 | 1 | 157500.0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 11 | 2 | 33.914454 | 3.627727 | 1 | 23 |
| 8575 | 36450 | 5149838 | 0 | 0 | 1 | 157500.0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 11 | 2 | 33.914454 | 3.627727 | 1 | 32 |
| 8576 | 36451 | 5150049 | 0 | 0 | 1 | 283500.0 | 4 | 4 | 1 | 1 | 0 | 0 | 0 | 15 | 2 | 49.167334 | 1.793329 | 1 | 9 |
| 8577 | 36452 | 5150337 | 1 | 0 | 1 | 112500.0 | 4 | 4 | 3 | 4 | 0 | 0 | 0 | 8 | 1 | 25.155890 | 3.266323 | 1 | 13 |